Abstract

Four groups of college students were each given two base-rate
problems. Three of the groups were given an aid with the first
problem: (a) an instruction to list factors or aspects that were
relevant to solving the problem; (b) a fill-in-the-blank algorithm
that provided the correct solution; or (c) a seven-page tutorial
that explained base-rate problems and showed how to solve them
using a 2 x 2 table. No aid was provided for the second problem.
The control group replicated previous findings in disregarding the
base-rate information. The list-factors group showed no
improvement over the control group. The algorithm group showed
distinctly better performance for the first problem but were the
same as the control group for the second problem. The tutorial
group did best: 42% of answers to the first problem and 31% of
answers to the second problem were within ±.10 of the correct
answer. An error analysis identified a conceptual weakness in the
tutorial; a high rate of arithmetic errors was also found. College
students appear to lack the knowledge needed to solve base-rate
problems, but they can be taught this knowledge relatively easily.
Structuring as an Aid to Performance in Base-Rate Problems

A frequently studied class of inference problems requires the
integration of two kinds of probabilistic information: base-rate
information, that is, information about the population of events,
and diagnostic information, that is, information about the specific
event being considered. The base-rate fallacy is the tendency for
people to disregard base rates when given these inference problems.
For example, consider the following story problem:

Two companies operate in a given city, the Blue and
the Green (according to the color of cab they run).
Eighty-five percent of the cabs in the city are Blue and
the remaining 15% are Green. A cab was involved in a
hit-and-run accident at night. A witness later
identified the cab as a Green cab. The court tested the
witness' ability to distinguish between Blue and Green
cabs under nighttime visibility conditions. It found
that the witness was able to identify each color
correctly about 80% of the time, but confused it with the
other color about 20% of the time.

What do you think are the chances that the errant
cab was indeed Green, as the witness claimed?

(Bar-Hillel, 1980, pp. 211-212).

In response to this problem, which is becoming something of a
classic, most subjects answer 80%. Similar responses have been
found for story problems that are structurally similar but have
different cover stories (e.g., Lyon & Slovic, 1976). The
normatively correct answer, derivable by Bayes' Theorem, is a
probabilistic merging of both pieces of information provided in the
story, resulting in a probability of .41. The subjects' response
of .80 indicates a reliance on the diagnostic information given in
the story (here, the witness' testimony) and a disregard for the
base-rate information (here, the relative number of each color of
cab in the city).
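
The .41 figure can be checked directly. The following is a minimal
sketch, not part of the original study; the function name and its
parameter names are illustrative assumptions. It applies Bayes'
Theorem to the numbers stated in the cab problem.

```python
def posterior_probability(base_rate, hit_rate, false_alarm_rate):
    """Bayes' Theorem for a two-hypothesis problem.

    base_rate:        P(H), e.g. the proportion of Green cabs
    hit_rate:         P(report says H | H is true)
    false_alarm_rate: P(report says H | H is false)
    Returns P(H | report says H).
    """
    true_positive = base_rate * hit_rate
    false_positive = (1.0 - base_rate) * false_alarm_rate
    return true_positive / (true_positive + false_positive)


# Cab problem: 15% of cabs are Green; the witness identifies each color
# correctly 80% of the time and confuses it 20% of the time.
cab_answer = posterior_probability(0.15, 0.80, 0.20)
print(round(cab_answer, 2))  # 0.41
```

The numerator (.15 x .80 = .12) is the chance that the cab is Green
and is called Green; the second term in the denominator
(.85 x .20 = .17) is the chance that a Blue cab is called Green, so
the answer is .12 / .29.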

Subjects do not always disregard base rates. Research has
suggested that they do so only when they believe that the base-rate
information is not relevant (Bar-Hillel, 1980). Such information
can be made to seem more relevant, for example, by changing, in the
above story, the information "85% of the cabs in the city are Blue"
to "85% of the cab accidents in the city involve Blue cabs"
(Tversky & Kahneman, 1980). This wording apparently evoked a
causal link between the population of cabs and the accident being
considered. This causal connection heightened the apparent
relevance of the base rate (see also Ajzen, 1977).

Most of the research on the base-rate fallacy has focused on
variations in the stories, rather than on changing the subjects
(Bar-Hillel, 1983; Tversky & Kahneman, 1982). In contrast, the
focus of the present study was to explore the effect of different
kinds of aids that might help the subjects overcome the base-rate
fallacy. One approach that has been tried is to present the
subjects with experience, via slides sequentially presenting the
population of cases to the subjects (Christensen-Szalanski & Beach,
1982). That approach was found to be effective, but has been
criticized on the grounds that the subjects did not have to
integrate the two pieces of information in such a procedure; they
could simply count the relative frequency of the desired
co-occurrence (Beyth-Marom & Arkes, 1983). Moreover, generalization
of the improvement to other problems was not tested.

A class of aids called "focusing techniques" has been explored
by Fischhoff and his colleagues (Fischhoff, Slovic & Lichtenstein,
1979; Fischhoff & Bar-Hillel, 1984). This approach uses
instructions (e.g., "If you only knew the proportion of Green cabs
in the city, what would you think is the probability that the cab
was Green?") or problem variations (e.g., presenting the same
subject with three cab problems, in which the proportion of Green
cabs was first 2%, then 98%, then 15%) to focus the subject's
attention on the base-rate information. These aids did improve
performance, in the sense that the median response was closer to
the optimal answer. Unfortunately, they were equally effective in
changing subjects' responses to two other problems which were
superficially like the cab problem but for which it is optimal to
disregard the "base-rate" information. For these problems,
performance was worse using the aid. This result suggests that the
focusing techniques used in that research did not improve the
quality of subjects' thinking about the problems; rather, they
created demand characteristics that led the subjects to different
responses.

The present paper explores the effectiveness of three aids.
The first is a simple instruction to list factors that might be
relevant in answering the question. The second is an algorithm;
subjects were given the correct process to follow to solve the
problem, in the form of a fill-in-the-blanks algorithm; they were
not told, however, why this set of calculations was correct. Both
these aids have been shown to be effective in a task of estimating
unstructured uncertain quantities such as "How many cigarettes were
sold in the U.S. last year?" (MacGregor, Lichtenstein & Slovic,
1984). In that study, the performance of subjects given the
algorithm was greatly superior to that of a control group; even the
"List Factors" group showed some improvement. Presumably, the
algorithm, and, to a lesser extent, the "List Factors" instruction,
helped the subjects to access and organize their knowledge.

The algorithms previously used required the subjects to
estimate some quantities; for example, in the Cigarette algorithm
subjects had to estimate the population of the U.S., the proportion
who smoke, and the average number of cigarettes a smoker smokes in
one day. In contrast, the algorithm for a base-rate problem
requires no estimation. To use it, one need only extract from the
problem the necessary information, put it in the appropriate
spaces, and correctly follow the instructions for arithmetic
manipulations on the numbers. Thus, we would expect radical
improvement in performance when the algorithm is available.

One difficulty with the correct solution of base-rate problems
is that it requires understanding of a moderately complex
integration rule. Even an approximation to the correct answer
requires an understanding that the two pieces of information need
to be played off against each other, one indicating that the
desired probability is high, the other that it is low. The
algorithm here used, although it does lead to the correct answer,
may not illuminate any understanding of the integration process
involved. Without such understanding, subsequent performance would
be expected to return to unaided levels. To test this conjecture,
we presented each subject a second, similar base-rate problem
without an algorithm.

For our final aid, we wrote a lengthy tutorial in which we
tried to explain both how to do base-rate problems and why our
approach was correct. Our goal was to teach the solution to
base-rate problems so that subjects would understand the process
involved.

Method 

Subjects.  The  subjects  were  305  paid  volunteers  who  responded 
to ads in the University of Oregon student newspaper. The present
tasks  were  completed  along  with  several  other  unrelated  paper-and- 
pencil  tasks  in  a  one-  to  two-hour  period.  Except  as  noted  below, 
all  subjects  were  run  in  groups  of  30  to  60  people  in  a  large 
university  classroom. 

Design. All subjects were given two base-rate problems, here
called the Lightbulb problem (adapted from Lyon & Slovic, 1976) and
the Dyslexia problem; both are shown in Table 1. Approximately
half the subjects received the Lightbulb problem first; the others
received the Dyslexia problem first. The two administrations were
separated by two unrelated tasks. For all subjects, the second
problem was presented in its Control form, the form shown in Table
1. The first problem was presented in four different forms:

1. Control. The Control form was given to 41 subjects.

2. List. The List form was given to 86 subjects. In the
List form, after the problem was presented, the instructions read:

Before answering the question, we would like you to list
the things one should consider in answering this
question. These things could be a list of factors or
components that would be useful in arriving at an answer
or they could be ways for going about arriving at an
answer. Make your list here:

[seven blank lines]

Now, answer the question:

"What is the probability that this bulb is really
defective [the child really has dyslexia]?"

You can probably give a good estimate if you think
hard and carefully.

Answer

Insert Table 1 about here


3. Algorithm. The Algorithm form, given to 76 subjects, started
with this instruction:

In this task we would like you to work through a
problem by carefully following a number of detailed
steps. First, you will read through the problem. Then,
you will follow a series of steps, some that ask you to
pull information directly from the problem itself, and
others that ask you to carry out basic arithmetic.

Please follow all the directions carefully. Pay special
attention to the accuracy of your arithmetic. This is
not a test of your ability to do arithmetic, but accuracy
of computation is essential to what we are asking you to
do.

[The problem followed.]

After the problem was an algorithm composed of thirteen steps, as
shown for the Lightbulb problem in Table 2. On the page following
the algorithm, two additional questions were asked:

Do you think the answer in (M) is a sensible answer to
the question, "What is the probability that this
lightbulb is really defective [the child really has
dyslexia]?" Yes ___ No ___

If you answered No, what do you think is a sensible
answer?

Insert Table 2 about here

4. Tutorial. The Tutorial form, given to 102 subjects, was a
six-page, single-spaced essay. The seventh page presented the
problem with space to work it and a summary of the seven steps to
solution discussed in the essay. The last page asked, "Does your
answer seem sensible to you? Yes ___ No ___." However, unlike the
Algorithm instructions, a more sensible answer was not requested.
Instead, subjects responding "No" were urged to:

. . . review the steps above. You may have made an
error in following the procedure or in doing the
arithmetic. Check for errors and correct any you find.

OR it may be that your intuitions are wrong and the
procedure is correct. Think again about the importance
of taking into account both the population information
and the specific information.

The tutorial, shown in the Appendix, was based on an approach
using 2x2 tables rather than Bayes' Theorem, in accordance with
Shaughnessy's (1983) view that 2x2 tables "help people focus on
the restricted sample space which plays so vital a role in
conditional probability problems" (p. 344; emphasis in original).
It was an expansion of the explanation of base-rate problems given
by Beyth-Marom, Dekel, Gombo, and Shaked (1985).
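
The 2x2 table approach can be sketched in a few lines. This is an
illustrative reconstruction, not the tutorial's own wording: the
function names, the dictionary layout, and the cell labels are
assumptions, and the notional population of 1000 follows the text's
mention of allocating proportions of 1000. The numbers are those of
the Dyslexia problem in Table 1.

```python
def two_by_two_table(base_rate, hit_rate, false_alarm_rate, population=1000):
    """Allocate a notional population to the four cells of a 2 x 2 table.

    Rows are what the test says; columns are the true state.
    """
    present = base_rate * population      # column total: condition present
    absent = population - present         # column total: condition absent
    return {
        ("test positive", "present"): hit_rate * present,
        ("test positive", "absent"): false_alarm_rate * absent,
        ("test negative", "present"): (1 - hit_rate) * present,
        ("test negative", "absent"): (1 - false_alarm_rate) * absent,
    }


def probability_given_positive(table):
    """Restrict attention to the 'test positive' row (the restricted
    sample space) and take the share of it in which the condition is
    really present."""
    tp = table[("test positive", "present")]
    fp = table[("test positive", "absent")]
    return tp / (tp + fp)


# Dyslexia problem: 2% base rate; the test is positive for 95% of
# children with dyslexia and for 5% of children without it.
dyslexia = two_by_two_table(0.02, 0.95, 0.05)
print(round(probability_given_positive(dyslexia), 2))  # 0.28
```

Of 1000 first graders, 20 have dyslexia (19 test positive) and 980
do not (49 test positive), so the answer is 19 / 68.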

The tutorial was given to 35 subjects in the usual large
classroom groups and to 67 subjects who were run in small groups
(4-7 people), with fewer other tasks and with small battery-powered
calculators available for use.

For all four groups, none of the subjects knew when they
completed the first form that they would later be given the second,
Control form (one Tutorial subject asked the experimenter whether
she was supposed to remember it all and was told no).

Results 

Lightbulb vs. Dyslexia. In order to compare the answers given
to the two different problems, we counted the number of correct
answers (for this count we required two-digit accuracy) and also
tallied the number of answers for each problem in seven categories:

1. Too Low: Answers falling more than .10 below the correct
answer. For the Lightbulb problem, this range was .00-.30; for
Dyslexia, .00-.17.

2. About Right: Answers that were within .10 of the correct
answer, including all correct answers. For the Lightbulb problem,
this range was .31-.51; for Dyslexia, .18-.38.

3. Middling: Answers greater than .10 above the correct
answer but below the diagnosticity (Lightbulb, .52-.79; Dyslexia,
.39-.94).

4. Diagnostic: Answers that were equal to the diagnosticity
value stated in the problem (Lightbulb, .80; Dyslexia, .95).

5. Way High: Answers greater than the diagnosticity but not
exceeding 1.00.

6. Outside: Negative answers and answers greater than 1.00.

7. None: No numerical answer given.
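
The seven categories amount to a simple decision rule. The sketch
below is one way to state it, assuming answers are recorded as
probabilities rounded to two decimals; the function name and
argument names are illustrative, not taken from the study.

```python
def categorize(answer, correct, diagnosticity):
    """Assign a response to one of the seven categories.

    answer:        the subject's probability, or None if no number given
    correct:       correct answer to two decimals (.41 Lightbulb, .28 Dyslexia)
    diagnosticity: value stated in the problem (.80 Lightbulb, .95 Dyslexia)
    """
    if answer is None:
        return "None"
    if answer < 0 or answer > 1:
        return "Outside"
    a = round(answer, 2)
    if a > diagnosticity:
        return "Way High"
    if a == diagnosticity:
        return "Diagnostic"
    if a > round(correct + 0.10, 2):   # more than .10 above the correct answer
        return "Middling"
    if a >= round(correct - 0.10, 2):  # within .10 of the correct answer
        return "About Right"
    return "Too Low"


print(categorize(0.80, 0.41, 0.80))  # Diagnostic
print(categorize(0.35, 0.41, 0.80))  # About Right
```

Because the cut points are defined relative to the correct answer
and the diagnosticity, the Too Low and Middling ranges differ in
width between the two problems, while About Right, Diagnostic,
Outside, and None do not.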

A comparison of the distributions of responses between the two
problems showed that the problem with the larger range for a given
category had more responses in that category. For example, across
all groups, 34% of the responses fell Too Low for the Lightbulb
problem but only 20% were Too Low for Dyslexia. For the Dyslexia
problem, 19% of all responses were Middling whereas only 6% were
Middling for Lightbulb. However, the response categories of
special interest had equal ranges across the two problems, and for
these, About Right, Diagnostic, Outside, and None, the
distributions for the two problems were remarkably similar. Thus
we collapsed the data across the two problems.

Large vs. small groups. The tutorial condition was given in
both large group and small group administration. The distributions
of responses in the seven categories, collapsed across problems,
did not differ for the two administrations. Indeed, exactly the
same percentage of subjects gave the right answer. We thus
collapsed the data across this variable.

Main results. The primary results of the experiment, the
proportion of subjects giving answers in each category for each
group, are shown in Table 3. The percentages of exactly correct
responses are shown in parentheses because these percentages are
included in the About Right category. The first column gives
results for the Control group for both administrations; thus it is
based on two responses from each of 41 subjects. The columns
labeled "2nd" show the responses to the second administration for
the other three conditions; this was always the Control form.


Insert  Table  3  about  here 


The results show that the List condition had no effect. Both
the first (List) and second (Control) administrations showed
results highly similar to the Control group, which, in turn, had
results similar to previous experiments (e.g., Bar-Hillel, 1980).
In contrast, the Algorithm and Tutorial conditions showed striking
effects; no subjects gave a response equal to the diagnosticity
value and about 40% gave responses close to the correct response.
For the Algorithm group, this improvement did not generalize to the
second, Control, problem; that distribution looks like the Control
distribution. One might suppose that if the algorithm had
any generalizable effect, that effect might be limited to the 29
subjects who arrived at about the right answer when using it.
However, when presented with the second, Control problem, 16 of
these 29 subjects (55%) responded with the diagnosticity and only
one gave about the right answer.

The Tutorial group did appear to learn something. When they
were given the Control problem, 31% gave about the right answer
whereas only 9% gave the diagnosticity value. On this second
problem 23% were able to come up with the correct answer accurate
to two decimal places, although they had no guidance in front of
them for doing so.

Is your answer sensible? After completing the first problem,
the subjects in the Algorithm and Tutorial groups were asked
whether the answer arrived at seemed sensible to them. The answers
to this question are shown in Table 4. For both groups, the
majority of subjects who answered the question said yes. For
neither group was the proportion of Yes answers significantly
different for those whose answer was about right than for the other
subjects.


Insert  Table  4  about  here 


The Algorithm group were then asked, "If you answered No, what
do you think is a sensible answer?" Only 20 of the 26 "No"
subjects gave a revised answer. Of these revised answers, only one
was close to correct; this subject had perfectly performed the
algorithm, arriving at an answer of .41 to the Lightbulb problem,
but said that a sensible answer was .35. Eight subjects gave the
base rate, six subjects gave the diagnosticity, and there were five
other responses. In all, 12 of the 20 revised responses were in
the Too Low range, supporting the finding shown in Table 4 that
most of the Algorithm subjects who had originally calculated a low
number found it sensible.

Errors. The algorithm used for the Algorithm group was
complete and correct; one needed only follow directions, extract
the needed information from the story, and perform simple
arithmetic to arrive at the correct answer. However, only 22% of
the subjects were able to do so. Extracting the needed information
from the story was performed incorrectly by 47% of the subjects;
32% made errors in copying a number from one place in the algorithm
to another, 54% made one or more arithmetic errors, and 4% failed
to complete the algorithm.

Arithmetic errors were also made by subjects in the Tutorial
group, by 43% of the subjects in the large group administration and
by 18% of subjects run in small groups, for whom hand-held
calculators were available. We also searched for conceptual
errors, to see if our tutorial was clear. In a previous version of
the tutorial, subjects had difficulty identifying the base rate.
The current version, therefore, stressed this, with apparent
success; 84% of all subjects correctly identified the base rate
and allocated the appropriate proportions of 1000 to the two places
below the 2x2 table. In contrast, our subjects had difficulty in
allocating numbers to the four cells. The most common error, made
by 38% of the subjects, was to put the right numbers in the wrong
cells, specifically (as exemplified by the Dyslexia problem):


                       First Graders

                   Have Dys.     No Dys.

Test says:
   Have Dys.
   No Dys.

                                   980        1000


The  tutorial  did  not  warn  about  this  particular  error. 

Discussion 

Base-rate problems are difficult problems. Most college
students cannot do them correctly without substantial help.
Indeed, Eddy (1982) has shown that the authors of authoritative
medical texts, who presumably have much more education and
sophistication than college students, frequently make errors in
understanding the significance of base rates in interpreting
mammograms (tests for breast cancer).

Our  least  potent  aid,  asking  subjects  to  list  relevant 
factors,  was  entirely  ineffective.  This  result  is  consistent  with 
the  view  that  subjects  do  not  have  the  knowledge  necessary  to  solve 
base-rate  problems.  Thus,  thinking  harder  about  the  problem 
doesn't  help. 

The algorithm improved performance only when it was in front
of the subjects; it had no effect on the second, unaided problem.
Our instructions did not suggest that the subjects should study the
algorithm or try to see what process it represented. Apparently,
the subjects got caught up in putting the right numbers in the
right places without gaining any insight into the problem or its
solution.

The effective aid was a specially written tutorial on how to
solve base-rate problems. When the tutorial was in front of them,
42% of these subjects arrived at about the right answer. Moreover,
when presented with the second, Control problem, 31% gave about the
right answer and only 9% gave the diagnosticity.

There were two main barriers to success in the tutorial
condition. First, the tutorial appears, in retrospect, to have
given insufficient attention to the task of allocating numbers to
cells. This conceptual problem might be rectified by rewriting
and expanding the tutorial. Second, the subjects' elementary
arithmetic skills were weak.

Nonetheless, the tutorial approach holds great promise.
Although it appears that most college students do not start with
the knowledge required to solve base-rate problems, they can be
taught it successfully in a relatively short period of time (about
half an hour) without individual tutoring, practice, or feedback.

Two further problems remain. First, people who are taught to
perform well on base-rate problems may not be able to discriminate
between base-rate problems, in which their new training is
relevant, and other, somewhat similar problems that cannot be
solved using this approach, as the results of Fischhoff and
Bar-Hillel (1984) suggest. Second, those trained in the laboratory
on story problems may not be able to recognize base-rate problems
that arise elsewhere.

If a tutorial could be written that solved the first problem
(when not to use the technique), it might form the basis for a
larger educational program to address the second problem
(recognizing base rates in daily life). We share the optimism of
Nisbett, Krantz, Jepson, and Kunda (1983), who suggested that
"training in statistics should promote statistical reasoning even
about mundane events of everyday life. . . ." (p. 347).

Table 1

The Base Rate Problems


Light Bulb


Consider the following problem:

A light bulb factory uses a scanning device which is supposed
to put a mark on each defective bulb it spots in the assembly line.
Eighty-five percent (85%) of the light bulbs on the line are OK;
the remaining 15% are defective.

The scanning device is known to be accurate in 80% of the
decisions, regardless of whether the bulb is actually OK or
actually defective. That is, when a bulb is good, the scanner
correctly identifies it as good 80% of the time. When a bulb is
defective, the scanner correctly marks it as defective 80% of the
time.

Suppose someone selects one of the light bulbs from the line
at random and gives it to the scanner. The scanner marks this bulb
as defective.

What is the probability that this bulb is really defective?

Table 1 (continued)


Dyslexia


Dyslexia is a disorder characterized by an impaired ability to
read. Two percent (2%) of all first graders have dyslexia. A
screening test for dyslexia has recently been devised that can be
used with first graders. The screening test is cheap and easy to
administer; it identifies those children who will later be given a
more extensive test to determine for sure whether the child has
dyslexia. The screening test is not completely accurate. For
children who really have dyslexia, the screening test is positive
(indicating dyslexia) 95% of the time. But it also gives a
positive (dyslexia) result for 5% of the normal children, the ones
who do not have dyslexia.

A first grader is given the screening test and the result is
positive, indicating dyslexia.

What is the probability that the child really has dyslexia?


Table 2

(A) Out of 1,000 light bulbs produced by the factory, how many are
defective? Multiply the percentage of defective bulbs by
1,000. (First convert the percentage value to a decimal value
before multiplying.)

1,000 x ______ = ______ (A)
        Proportion of
        Defective Bulbs

(B) Subtract your estimate in (A) from 1,000 to get the number of
bulbs out of 1,000 that are NOT defective.

1,000 - (A) ______ = ______ (B)

(C) What percentage of the time is the scanner
able to correctly identify light bulbs that
are actually defective? (from the problem)  ______ (C)

(D) What percentage of the time is the scanner
able to correctly identify light bulbs that are
actually not defective? (from the problem)  ______ (D)

Box #1     Box #4

Box #2     Box #3

Table 2 (continued)

(I) Subtract your value in (H) from your value in (A).

(A) ______ - (H) ______ = ______ (I)

Write your value for (I) in Box #2.

(J) Multiply the percentage value in (D) by your estimate from
(B). (First convert the percentage value to a decimal value
before multiplying.)

(B) ______ x (D) ______ = ______ (J)

Write your value for (J) in Box #3.

(K) Subtract your value in (J) from your value in (B).

(B) ______ - (J) ______ = ______ (K)

Write your value for (K) in Box #4.

(L) Add the numbers in Boxes #1 and #4.

Box #1 ______ + Box #4 ______ = ______ (L)

Write your value for (L) on the line labeled (L), to the right
of the boxes.

(M) To get the final answer, divide your value in Box #1 by your
value for (L).

Box #1 ______ / (L) ______ = ______ (M)
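
The lettered steps can be traced mechanically for the Lightbulb
problem. The sketch below is illustrative only. Steps (E) through
(H) do not appear in the table as reproduced here, so steps (E)-(G)
are omitted and step (H) is ASSUMED to be the value of (C), as a
decimal, times (A), written in Box #1; that assumption is consistent
with steps (I) and (M) but is not confirmed by the source. The
function and parameter names are hypothetical.

```python
def lightbulb_algorithm(pct_defective=15, pct_correct_defective=80,
                        pct_correct_ok=80):
    """Trace the lettered steps of the fill-in-the-blanks algorithm.

    Steps (E)-(G) are missing from the reproduced table and are skipped.
    Step (H) is an ASSUMPTION: (C) as a decimal times (A), i.e. Box #1.
    """
    s = {}
    s["A"] = 1000 * pct_defective / 100   # defective bulbs out of 1,000
    s["B"] = 1000 - s["A"]                # bulbs that are NOT defective
    s["C"] = pct_correct_defective        # percentage, from the problem
    s["D"] = pct_correct_ok               # percentage, from the problem
    s["H"] = s["C"] / 100 * s["A"]        # assumed step; Box #1
    s["I"] = s["A"] - s["H"]              # Box #2
    s["J"] = s["D"] / 100 * s["B"]        # Box #3
    s["K"] = s["B"] - s["J"]              # Box #4
    s["L"] = s["H"] + s["K"]              # Box #1 + Box #4
    s["M"] = s["H"] / s["L"]              # final answer
    return s


steps = lightbulb_algorithm()
for letter in "ABHIJKL":
    print(letter, round(steps[letter]))
print("M", round(steps["M"], 2))  # M 0.41
```

Each intermediate value is a whole number of bulbs, so a slip in
copying a value from one step to the next changes the final ratio
directly.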

Table 3

Distributions of Answers, in Percentages, for All Groups

               Control       List        Algorithm      Tutorial
                 Both     1st    2nd     1st    2nd     1st    2nd

Too Low           21       27     20      24     30      30     35
About Right        9        7      5      38      3      42     31
  (Exact)         (4)      (1)    (0)    (22)    (0)    (31)   (23)
Middling          15       14     23       7     11       6     12
Diagnostic        48       43     45       0     51       0      9
Way High           2        5      6       5      3       7      4
Outside            0        0      0      22      1       4      4
None               5        5      1       4      1      11      5

No. of Ss         41(a)       86             76            102

(a) Each of 41 subjects contributed two responses to this distribution.

Table 4

Frequencies of Answers to the Sensibleness Question

                      Algorithm               Tutorial
                  Yes    No    % Yes      Yes    No    % Yes

Too Low            14     3      82        20    11      61
About Right        16    13      55        36     6      86
Too High            6     3      67         9     4      69
Outside            10     7      59         3     0     100

Total              46    26      64        68    21      76

Not answered        4                      13


Tutorial (Lightbulb version)


Consider the following problem:

A cab was involved in a hit and run accident at night. Two cab
companies, the Green and the Blue, operate in the city. You are given
the following data:

(a) 90% of the cabs in the city are Green and 10% are Blue.

(b) a witness identified the cab as Blue.

The court tested the reliability of the witness under the same
circumstances that existed on the night of the accident and concluded
that the witness correctly identified each one of the two colors 70% of
the time and failed 30% of the time.

What is the probability that the cab involved in the accident was
Blue rather than Green?

Research  has  shown  that  people  often  have  trouble  answering  problems 
like  this.  In  this  portion  of  today's  experiment,  we  are  presenting  you 
with  a  mini-tutorial  to  see  if  instruction  will  help  you  solve  such 
problems.  Please  read  through  the  tutorial  carefully.  We  have  allowed 
time  in  the  experiment  for  you  to  do  that. 

Tutorial 

The class of problems here addressed are problems for which two
kinds of information are given and a probability is requested. One kind
of information is about the population or populations in question. The
other kind of information is specific to the case at hand.

In the problem given above, the population is the population of cabs
in the city. The population information is that 90% of the cabs are
Green and 10% are Blue. The specific information concerns the specific
cab that was involved in a hit and run accident. The witness said that
that specific cab was Blue, but we also know about this testimony that
the witness is not perfectly accurate. The witness is able to correctly
identify the color of the cab 70% of the time.

The way most people usually go wrong in solving these problems is
that they concentrate too much on the specific information and tend to
neglect the population information. Maybe the specific information
seems more immediately relevant to them. Or perhaps they just don't
know how to go about combining the information to produce a single
answer. Here is a way of doing just that:
Step 1. Draw a table. Begin by drawing a "two-by-two" table, that
is, a diagram with two rows and two columns, like this:

                      +---------+---------+
                      |         |         |
                      +---------+---------+
                      |         |         |
                      +---------+---------+

Step 2. Label the table. We'll label the columns for the
population information. The population is cabs in the city, which are
either Blue or Green. The rows get the specific information, that is,
the witness testimony, which was Blue; but for completeness, we'll also
label the other row Green, because the witness could have said Green.

So now our table looks like this:

                         Cabs in the City
                         Blue       Green
                      +---------+---------+
                Blue  |         |         |
Witness said:         +---------+---------+
                Green |         |         |
                      +---------+---------+


Labeling the table is not quite as simple as it may first appear.
Notice that the sub-labels, "Blue" and "Green", are the same for the
rows and the columns. This should generally be true in such problems.

It would be a mistake to label the rows according to whether the witness
was accurate or inaccurate:

                Right
Witness was:
                Wrong

The  problem  could  be  solved  with  such  labeling,  but  not  using  the  method 
we  are  teaching  you  here.  In  general,  the  sub-labels  are  the  two 
possible  states  of  the  world.  The  main  labels  (e.g.,  "Cabs  in  the  City" 
and  "Witness  said:")  indicate  the  source  of  information.  One  source  is 
always  population  information  (here,  the  relative  number  of  cabs  in  the 
city);  the  other  source  is  always  specific  information  (here,  what  the 
witness  said). 


Notice that if there were numbers in the four cells of the table, we
could calculate row totals and column totals and a grand total for the
whole table. The places for these totals are shown below with dashed
lines.

                         Cabs in the City           Row
                         Blue       Green         Totals:
                      +---------+---------+
                Blue  |         |         |       - - - -
Witness said:         +---------+---------+
                Green |         |         |       - - - -
                      +---------+---------+
Column Totals:          - - - -   - - - -         - - - -
                                                Grand Total


Step  3.  Assign  an  arbitrary  grand  total.  To  get  started,  we'll 
fill  in  the  grand  total.  That  should  be  the  total  number  of  cabs  in  the 
city.  But  we  don't  know  how  many  cabs  there  are  in  the  city.  So  we 
pick  an  arbitrary  total  of  1,000.  We  could  use  10  or  100  (or  any  other 
number),  but  using  1,000  will  make  later  calculations  easier. 

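                         Cabs in the City
                         Blue       Green
                      +---------+---------+
                Blue  |         |         |
Witness said:         +---------+---------+
                Green |         |         |
                      +---------+---------+
Column Totals:          - - - -   - - - -          1,000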


Step 4. Estimate the population totals. If there were 1,000 cabs
in the city, how many of them would be Blue? According to the story,
10% are Blue. That means 10 out of every 100 or 100 out of every 1,000
are Blue. That number, 100, is the left column total. The rest are
Green. So 1,000 - 100 = 900 is the right column total. We put these
column totals into the table:

                         Cabs in the City
                         Blue       Green
                      +---------+---------+
                Blue  |         |         |
Witness said:         +---------+---------+
                Green |         |         |
                      +---------+---------+
Column Totals:            100       900            1,000
WARNING. The method we're teaching you for solving these problems
won't work if you start out estimating the wrong totals. It's important
in this step to correctly identify which part of the problem gives
population information and which gives specific information. The
population information is general, background information that does not
implicate any specific case. The specific information fingers a
particular case.

Step 5. Fill in the cells. Working with each total, divide it
among its two cells. First, for the 100 Blue cabs, how many would the
witness correctly see as Blue, and how many would the witness
incorrectly see as Green? The story states that the witness is correct
70% of the time. So:

      100
    x .70
    -----
       70    is the number of Blue cabs the witness would correctly
call Blue, and the remaining, 100 - 70 = 30, are the number of Blue cabs
the witness would incorrectly call Green.

Now consider the 900 Green cabs. Again the witness' accuracy is
70%:

      900
    x .70
    -----
      630    is the number of Green cabs the witness would have
correctly called Green. This number, 630, goes in the Green-Green cell.
The rest of the Green cabs, 900 - 630 = 270, is the number of Green cabs
the witness would have incorrectly called Blue.


Our table now looks like this:

                         Cabs in the City
                         Blue       Green
                      +---------+---------+
                Blue  |    70   |   270   |
Witness said:         +---------+---------+
                Green |    30   |   630   |
                      +---------+---------+
Column Totals:            100       900            1,000
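
Steps 3 through 5 can also be sketched in a few lines of Python. This is only an illustration and was not part of the materials given to subjects. The function name build_table, its parameter names, and the "target"/"other" labels are assumptions introduced here; percentages are passed as whole numbers so that the arithmetic stays exact.

```python
def build_table(grand_total, pct_target, pct_accurate):
    """Fill a two-by-two table following Steps 3-5 of the tutorial.

    grand_total  -- arbitrary total number of cases (Step 3), e.g. 1000
    pct_target   -- population percentage for the left column (Step 4)
    pct_accurate -- percentage of correct identifications (Step 5)
    Returns a dict keyed by (what was said, what is true).
    Whole-number division is used, so this assumes the percentages
    divide the totals evenly, as they do with a grand total of 1,000.
    """
    # Step 4: split the grand total into the two column totals.
    target_total = grand_total * pct_target // 100
    other_total = grand_total - target_total

    # Step 5: split each column total into a correct and an incorrect cell.
    target_correct = target_total * pct_accurate // 100
    other_correct = other_total * pct_accurate // 100
    return {
        ("target", "target"): target_correct,
        ("other", "target"): target_total - target_correct,
        ("other", "other"): other_correct,
        ("target", "other"): other_total - other_correct,
    }


# Cab problem: 10% of cabs are Blue ("target"), witness is right 70% of the time.
cab_table = build_table(1000, 10, 70)
print(cab_table)
```

Here the key ("target", "other") means the witness said Blue while the cab was really Green, so the four values correspond to the cells 70, 30, 630, and 270 in the table above.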

Comment. Notice that we now could, if we wished, find the last two
totals, the total number of times the witness would have said "Blue,"
rightly or wrongly:

70 + 270 = 340

and the total number of times the witness would have said "Green,"
rightly or wrongly:

30 + 630 = 660.

These totals are not intuitively obvious. The reason is that these
totals are the total number of times the witness says "Green" and
"Blue." What the witness says depends not only on the witness' accuracy
but also on the relative proportions of Blue and Green cabs the subject
might have seen. You have to take both these facts into consideration
to calculate the totals. In contrast, the population totals make a lot
of sense, because they depend on only one kind of information, not two
kinds. The total number of Blue cabs in the city is directly calculated
as a percentage of the total number of cabs, regardless of what the
witness might testify. This distinction is important because it shows
you another way of telling, in any problem, which is the population
information (that you start with in Step #4) and which is the specific
information. The population information is information that directly
translates into number totals. The specific information is information
that does not translate into number totals because those number totals
depend not only on the specific information but also on the population
information.
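
The point that the row totals depend on both kinds of information can be checked numerically. The sketch below is illustrative only: the helper said_blue_total is a hypothetical name, and the 50% and 90% proportions of Blue cabs are made-up comparison values, not figures from the problem.

```python
def said_blue_total(grand_total, pct_blue, pct_accurate):
    """Number of times the witness would say "Blue", rightly or wrongly."""
    blue_cabs = grand_total * pct_blue // 100      # column total (Step 4)
    green_cabs = grand_total - blue_cabs           # the other column total
    blue_called_blue = blue_cabs * pct_accurate // 100
    green_called_blue = green_cabs - green_cabs * pct_accurate // 100
    return blue_called_blue + green_called_blue


# Same witness accuracy (70%), different proportions of Blue cabs.
for pct_blue in (10, 50, 90):
    print(pct_blue, said_blue_total(1000, pct_blue, 70))
```

With 10% Blue cabs the total is 340, as in the comment above. Changing only the population proportion changes this row total, whereas the Blue column total is always just the stated percentage of 1,000.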


In summary, here are two criteria (one discussed earlier) for
telling which is which:

The population information:

(a) is general, background information and

(b) can be translated directly into number totals.

The specific information:

(a) specifies or identifies one case and

(b) cannot be directly translated into number totals because those
totals also depend on the population information.


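Step 6. Cross out the false. The witness in the story in fact
testified that the cab was Blue. So the number of times the witness
might have said "Green" is irrelevant to the problem. We cross out
these false cells so we won't be tempted to use them in the next step:

                         Cabs in the City
                         Blue       Green
                      +---------+---------+
                Blue  |    70   |   270   |
Witness said:         +---------+---------+
                Green |  XXXXX  |  XXXXX  |   (30 and 630, crossed out)
                      +---------+---------+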


Step  7.  Find  the  needed  probability.  The  two  remaining  cells  are 
what  we  need  to  answer  the  question.  They  show  that  the  witness  would 
have  said  "Blue"  correctly  70  times  and  would  have  said  "Blue" 
incorrectly  270  times.  From  these  two  numbers  we  can  get  our 
probability. 

If you're not used to thinking about probabilities, a nice way to
think about them is to imagine that you fill an urn with 70 balls
labeled "cab is really Blue" and 270 balls labeled "cab is really
Green," for a total of 340 balls. Now sample one ball at random from
the urn. What is the probability that the ball will be labeled "cab is
really Blue?" The answer is the number of "cab is really Blue" balls
divided by the total number of balls in the urn:

    70          70
---------  =  -----  =  .21   (well, it's really .2058... but we rounded it)
 70 + 270      340

In other words, we divide the number in the TARGET cell by the sum of
the two numbers left in our table. The TARGET cell is the one cell
identified by both the specific information given in the problem ("a
witness identified the cab as Blue") and the question asked at the end
of the problem ("What is the probability that the cab involved in the
accident was Blue?"). So the target cell is the "Cab is Blue/Witness
said Blue" cell.

That's  it.  The  answer,  .21,  is  the  probability  that  the  hit-and-run 
cab  was  a  Blue  cab. 
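
Step 7 amounts to a single division, which can be sketched in Python as a check on the arithmetic. The function name probability_from_row is an assumption made for this illustration; exact fractions are used so that the rounding to .21 is visible as a separate step.

```python
from fractions import Fraction


def probability_from_row(target_cell, other_cell):
    """Step 7: target cell divided by the sum of the two remaining cells."""
    return Fraction(target_cell, target_cell + other_cell)


# After Step 6 only the "witness said Blue" row remains:
# 70 cabs that really are Blue and 270 that really are Green.
p_blue = probability_from_row(70, 270)
print(p_blue, round(float(p_blue), 2))
```

The exact value is 70/340, which reduces to 7/34 and rounds to .21.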

Are you surprised by the answer? Most people think that the correct
answer should be .70, the same as the witness' accuracy. They tend to
forget the population information, that is, they fail to notice that
because there are so many more Green cabs than Blue cabs, there are also
many more opportunities for the witness to be wrong when saying Blue.

Comment. While it's not necessary to solve the problem, it might
help you to understand what's going on by thinking about this: What if
the witness had testified that the cab was Green? Look back at the last
table, the one with two crossed-out cells. Those crossed-out cells show
30 really Blue cabs and 630 really Green cabs. So the probability that
the cab is really Green, if the witness said it was Green, is:

   630          630
---------  =  -----  =  .95
 30 + 630      660

This probability is higher than either the proportion of Green cabs in
the city (90%) or the accuracy of the witness (70%). That's because in
this case both pieces of information, the population proportion and the
witness' testimony, point in the same direction, towards Green.

Intermediate probabilities like .21 are found only when the two
pieces of information point in opposite directions: the witness said
Blue but most cabs are Green.
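
For readers who know Bayes' theorem, the table method computes the same quantity. The notation below does not appear in the tutorial and is introduced here only for illustration: B and G stand for the cab really being Blue or Green, and the quoted letters stand for the witness' testimony.

```latex
\begin{align*}
P(B \mid \text{``B''})
  &= \frac{P(\text{``B''} \mid B)\,P(B)}
          {P(\text{``B''} \mid B)\,P(B) + P(\text{``B''} \mid G)\,P(G)}
   = \frac{(.70)(.10)}{(.70)(.10) + (.30)(.90)}
   = \frac{.07}{.34} \approx .21 \\[4pt]
P(G \mid \text{``G''})
  &= \frac{P(\text{``G''} \mid G)\,P(G)}
          {P(\text{``G''} \mid G)\,P(G) + P(\text{``G''} \mid B)\,P(B)}
   = \frac{(.70)(.90)}{(.70)(.90) + (.30)(.10)}
   = \frac{.63}{.66} \approx .95
\end{align*}
```

Multiplying each product by the arbitrary grand total of 1,000 gives the cell counts 70, 270, 630, and 30 used in the table.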


That's the end of the tutorial. On the next page is a problem for
you to do. Before doing the problem:

1. Review the tutorial to make sure you understand it.

2. Ask any questions you have.
When you are ready, proceed to the problem on the next page. We are
interested in how effective the tutorial is in teaching you how to do
such problems. So while you are doing the problem, feel free to:

1.  Review  the  tutorial  again. 

2.  Use  a  hand  calculator. 

3. Ask questions.
Please work the following problem using the method just described.

We've  drawn  you  a  table  to  work  with. 

A light bulb factory uses a scanning device which is supposed to put
a mark on each defective bulb it spots in the assembly line. Eighty-five
percent (85%) of the light bulbs on the line are OK; the remaining
15% are defective.

The scanning device is known to be accurate in 80% of the decisions,
regardless of whether the bulb is actually OK or actually defective.

That is, when a bulb is good, the scanner correctly identifies it as
good 80% of the time. When a bulb is defective, the scanner correctly
marks it as defective 80% of the time.

Suppose  someone  selects  one  of  the  light  bulbs  from  the  line  at 
random  and  gives  it  to  the  scanner.  The  scanner  marks  this  bulb  as 
defective. 

What  is  the  probability  that  this  bulb  is  really  defective? 


Step 1. Draw a table. Done.


Step 2. Label the table.


Step 3. Assign an arbitrary grand total. Use 1,000.


Step 4. Estimate the population totals. First decide which set of
information is population information. Then divide the 1,000 into two
parts, using information from the problem.

Step 5. Fill in the cells. Divide each of your estimated totals among
its two cells, according to the information in the problem.

Step 6. Cross out the false. Cross out the two cells that are
contradicted by the information given in the problem.

Step  7.  Find  the  needed  probability.  Write  the  relevant  numbers  in  the 
top  and  bottom  of  the  fraction  and  convert  the  fraction  to  a  decimal 
answer. 


      # in target cell              _____
------------------------------  =  -------  =  _____
   Sum of #'s in both cells         _____


